Constructing a Decision Tree for Graph Structured Data

Authors

  • Warodom Geamsakul
  • Takashi Matsuda
  • Tetsuya Yoshida
  • Hiroshi Motoda
  • Takashi Washio
Abstract

This paper proposes Decision Tree Graph-Based Induction (DT-GBI), a method that constructs a decision tree for graph-structured data. At each node of the decision tree, GBI extracts substructures (patterns) by stepwise pair expansion (pairwise chunking), and these patterns serve as the attributes for testing. Because attributes (features) are constructed while the classifier is being built, DT-GBI can also be viewed as a feature construction method. The predictive accuracy of a decision tree depends on which attributes (patterns) are used and how they are constructed. Beam search is employed to extract sufficiently discriminative patterns within the greedy search framework. Pessimistic pruning is incorporated to avoid overfitting to the training data. Experiments on a DNA dataset examined the effects of the beam width, the number of chunking steps at each node of the decision tree, and pruning. The results indicate that DT-GBI, which uses no prior domain knowledge, can construct a decision tree comparable to other classifiers built with domain knowledge.
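
The idea of choosing a pattern as the test attribute at a tree node can be illustrated with a small sketch. This is not the authors' implementation: it reduces a "pattern" to a single pair of adjacent node labels (one step of pair expansion, with no repeated chunking), and it assumes information gain as the selection criterion, which the abstract does not specify. The graph encoding, the function names, and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def adjacent_label_pairs(graph):
    """Return the set of unordered node-label pairs joined by an edge.

    A graph is a (node_labels, edges) tuple: a dict mapping node id to
    label, and a list of (u, v) node-id pairs. This encoding is an
    assumption made for the sketch.
    """
    node_labels, edges = graph
    return {tuple(sorted((node_labels[u], node_labels[v]))) for u, v in edges}


def information_gain(graphs, classes, pair):
    """Gain from splitting the graphs on presence or absence of `pair`."""
    with_pair, without_pair = [], []
    for graph, cls in zip(graphs, classes):
        target = with_pair if pair in adjacent_label_pairs(graph) else without_pair
        target.append(cls)
    n = len(classes)
    remainder = (len(with_pair) / n) * entropy(with_pair) \
        + (len(without_pair) / n) * entropy(without_pair)
    return entropy(classes) - remainder


def best_pairs(graphs, classes, beam_width=1):
    """Keep the `beam_width` most discriminative pairs (assumed criterion)."""
    candidates = set().union(*(adjacent_label_pairs(g) for g in graphs))
    ranked = sorted(candidates,
                    key=lambda p: (-information_gain(graphs, classes, p), p))
    return ranked[:beam_width]


# Toy data: four labeled graphs, two per class.
graphs = [
    ({0: "a", 1: "b", 2: "c"}, [(0, 1), (1, 2)]),
    ({0: "a", 1: "b"}, [(0, 1)]),
    ({0: "b", 1: "c"}, [(0, 1)]),
    ({0: "c", 1: "d"}, [(0, 1)]),
]
classes = ["pos", "pos", "neg", "neg"]

print(best_pairs(graphs, classes))  # prints [('a', 'b')]
```

In this toy data the pair ("a", "b") appears in exactly the "pos" graphs, so testing for it separates the classes perfectly. A larger `beam_width` keeps several candidates alive at once, which mirrors the role of beam search within the greedy framework described above.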

Similar resources

Constructing Decision Trees for Graph-Structured Data by Chunkingless Graph-Based Induction

A decision tree is an effective means of data classification from which one can obtain rules that are easy to understand. However, decision trees cannot be conventionally constructed for data which are not explicitly expressed with attribute-value pairs such as graph-structured data. We have proposed a novel algorithm, named Chunkingless Graph-Based Induction (Cl-GBI), for extracting typical pa...

Constructing a Decision Tree for Graph-Structured Data and its Applications

A machine learning technique called Graph-Based Induction (GBI) efficiently extracts typical patterns from graph-structured data by stepwise pair expansion (pairwise chunking). It is very efficient because of its greedy search. Meanwhile, a decision tree is an effective means of data classification from which rules that are easy to understand can be obtained. However, a decision tree could not ...

Constructing Graceful Graphs with Caterpillars

A graceful labeling of a graph G of size n is an injective assignment of integers from {0, 1,..., n} to the vertices of G, such that when each edge of G is assigned a weight, given by the absolute difference of the labels of its end vertices, the set of weights is {1, 2,..., n}. If a graceful labeling f of a bipartite graph G assigns the smaller labels to one of the two stable sets of G, then f ...

Analysis of Hepatitis Dataset by Decision Tree Graph-Based Induction

We analyzed the hepatitis data by Decision Tree Graph-Based Induction (DT-GBI), which constructs a decision tree for graph-structured data while simultaneously constructing attributes for classification. An attribute at each node in the decision tree is a discriminative pattern (subgraph) in the input graph, and is extracted by Graph-Based Induction (GBI). We conducted four kinds of experiments usin...

Modelling Decision Problems Via Birkhoff Polyhedra

A compact formulation of the set of tours neither in a graph nor its complement is presented and illustrates a general methodology proposed for constructing polyhedral models of decision problems based upon permutations, projection and lifting techniques. Directed Hamilton tours on n vertex graphs are interpreted as (n-1)- permutations. Sets of extrema of Birkhoff polyhedra are mapped to tours ...

Publication date: 2003